Abstract

A worst case distribution is a minimiser of the expectation of some
random payoff within a family of plausible risk factor distributions.
The plausibility of a risk factor distribution is quantified by a convex
integral functional. This includes the special cases of relative entropy,
Bregman distance, and $f$-divergence. An ($\epsilon$-$\gamma$)-almost worst
case distribution is a risk factor distribution which violates the
plausibility constraint at most by the amount $\gamma$ and for which the
expected payoff is not better than the worst case by more than $\epsilon$.
From a practical point of view the localisation of almost worst case
distributions may be useful for efficient hedging against them. We prove
that the densities of almost worst case distributions cluster in the
Bregman neighbourhood of a specified function, interpreted as worst case
localiser. In regular cases, it coincides with the worst case density, but
when the latter does not exist, the worst case localiser is perhaps not
even a density. We also discuss the calculation of the worst case
localiser, and its dependence on the threshold in the plausibility
constraint.
1 Introduction


Let the monetary payoff or utility of some action, e.g. of a portfolio choice,
be described by a function $X(r)$ of a collection $r$ of random risk factors.
Suppose the probability distribution which governs the risk factors is not
known exactly but may be assumed to belong to a set $\Gamma$ of distributions
on the sample space of scenarios $r$ (multiple priors model). Then the worst
case expected payoff

$$\inf_{P \in \Gamma} E_P(X) = \inf_{P \in \Gamma} \int X(r)\, P(dr) \tag{1}$$

may be taken as (the negative of) the model risk caused by the lack of
knowledge about $P$. The same expression emerges also in the theory of
preferences. Ambiguity averse decision makers may rank possible actions by
the criterion of expected utility in the worst case over $\Gamma$. Risk
measures or preference criteria of a more general kind involve penalised
expected payoff or utility

$$\inf_{P} \bigl( E_P(X) + \alpha(P) \bigr), \tag{2}$$

where $\alpha(P)$ is a suitable penalty term. For details, including axiomatic
considerations leading to (1) or (2), we refer for example to Föllmer and
Schied [9], Hansen and Sargent [12], or Gilboa [10].

Any risk measure satisfying some natural postulates (in which case it
is dubbed coherent) can be represented as the negative of (1) for some
convex set of distributions $\Gamma$. Relaxing coherence to “convexity” yields
(2), with some convex penalty term $\alpha(P)$. For our purposes, axiomatic
theory serves as motivation only. In that theory the infimum in (1) typically
equals a minimum. In the models treated in this paper, a worst case
distribution $P \in \Gamma$ attaining the minimum in (1) need not exist.

If a “best guess” $P_0$ of the unknown risk factor distribution is available,
it is natural to use (1) with $\Gamma$ consisting of those distributions $P$
that do not deviate much from $P_0$. In the literature many measures of
deviation of distributions are available; the majority are non-symmetric.
The most versatile one, in various scientific disciplines, is $I$-divergence
or relative entropy. For an axiomatic approach distinguishing $I$-divergence
in the context of inference see Csiszár [7] and references therein. Relaxing
some axioms, that approach leads as alternatives to other frequently used
measures of deviation of distributions, known as $f$-divergences and Bregman
distances, see Section 2 for definitions. In the context of risk and
preferences several authors, perhaps first Hansen and Sargent, have
considered (1) with $\Gamma$ equal to an $I$-divergence ball around $P_0$, or
(2) with $\alpha(P)$ equal to a constant times the $I$-divergence of $P$ from
$P_0$. The preference relation based on (2) with this choice of $\alpha(P)$,
called multiplier preferences in [12], has been axiomatically distinguished
by Strzalecki [15]. Moreover, according to Ahmadi-Javid [1] the coherent risk
measure he calls entropic value at risk, obtained by taking an $I$-divergence
ball for $\Gamma$ in (1), is superior to others from the point of view of
computability. General $f$-divergences have been employed in this context by
Maccheroni et al. [13] and Ben-Tal and Teboulle [2], see also references in
[2] to prior work of its authors. Bregman distances could be used similarly,
but we are not aware of references for this.

We consider problem (1) with $\Gamma$ of the following form, including as
special cases $I$-divergence balls, $f$-divergence balls and Bregman balls:

$$\Gamma := \{ P : dP = p\, d\mu,\ H(p) \le k \}, \tag{3}$$

where $\mu$ is a given measure on $\Omega$ and $H$ is a convex integral
functional as specified in Section 2.1. A corresponding choice of $\alpha(P)$
in (2) is $\alpha(P) = \lambda H(p)$, $\lambda > 0$.

Our main focus in this paper is the location of the infimum, rather than
the value of the worst case expected payoff (1) or the related infimum (2).
When the infimum is not achieved, there is no worst case distribution, and
it is not obvious what the location of the infimum should mean. We introduce
the concept and prove the existence of a “localiser of almost worst case
distributions”, which in the following sense characterises the location of
the infimum, whether or not the minimum is achieved: almost worst case
distributions achieving values ever closer to the infimum are in ever smaller
Bregman balls around the localiser. Part of the results were presented in
the symposium contribution [3].

The problem of minimising $E_P(X)$ subject to $H(p) \le k$ is related to
the problem of minimising convex integral functionals subject to moment
constraints. This problem, an extension of the celebrated “information
geometric” problem of $I$-divergence minimisation, has been extensively
studied in the literature. We rely upon those results in the form presented
by Csiszár and Matúš [8] and we use the basic framework of Breuer and Csiszár
[5] presented in Section 2.

The new results are presented in Section 3. Theorem 1 in Subsection 3.1
extends a result of Ahmadi-Javid [1, Theorem 5.1] on computing the infimum
in (1) for $\Gamma$ of form (3) to our framework^1 that admits also
non-autonomous integrands and unbounded payoff function $X$. Our main result,
Theorem 2 in Subsection 3.2, addresses the worst case and almost worst case
distributions (densities) that attain or almost attain the minimum in (1).
The almost worst case densities are shown to cluster, in Bregman distance,
around a specified function called worst case localiser. A similar result is
obtained also for problem (2). The worst case localiser equals the worst case
density if the minimum is attained, while otherwise it is perhaps not a
density at all. Finally, Subsection 3.3 addresses the effect of the threshold
$k$ in (3). Theorems 3 and 4 show that in many situations, including the case
of $f$-divergence balls, either a worst case distribution exists for all
$k > 0$, or else it exists for $k$ less than a critical value $k_{cr} > 0$ and
does not exist for $k$ larger than $k_{cr}$. It remains open whether a similar
result also holds in general, apart from the possibility, demonstrated by an
example with Bregman balls, that no worst case distribution exists for any
$k > 0$.

^1 This framework does include some assumptions, adopted for other purposes,
which were absent in [1].

2 Preliminaries 

2.1 General framework 

Let $\Omega$ be any set equipped with a (finite or $\sigma$-finite) measure
$\mu$ on a $\sigma$-algebra not mentioned in the sequel. Probability measures
$P \ll \mu$ will be represented by their densities $p = dP/d\mu$. The notation
$p$ will be used also for nonnegative (measurable) functions on $\Omega$ which
are not densities, i.e., do not have integral 1. Equality of functions on
$\Omega$ will be meant in the $\mu$-almost everywhere ($\mu$-a.e.) sense.

Let $H$ be a convex integral functional defined on the vector space of
measurable functions^2 on $\Omega$ by

$$H(p) = H_{\beta,\mu}(p) := \int_\Omega \beta(r, p(r))\, \mu(dr). \tag{4}$$

Here $\beta(r,s)$ is a function of $r \in \Omega$, $s \in \mathbb{R}$,
measurable in $r$ for each $s \in \mathbb{R}$, strictly convex and
differentiable^3 in $s$ on $(0,+\infty)$ for each $r \in \Omega$, and
satisfying

$$\beta(r,0) = \lim_{s \downarrow 0} \beta(r,s), \qquad \beta(r,s) := +\infty \ \text{if } s < 0. \tag{5}$$

^2 This functional will be considered only for nonnegative functions $p$, with
no loss of generality since $p \ge 0$ ($\mu$-a.e.) is a necessary condition for
$H(p) < +\infty$, see (5).

^3 Strict convexity appears essential for our main results. Differentiability
is assumed for convenience; it could be dispensed with as in [8].

Then $\beta$ is a convex normal integrand in the sense of Rockafellar and Wets
[14], which ensures the measurability of $\beta(r, p(r))$ in (4) and of
similar functions later on.

Let $X$ be any measurable function interpreted as payoff function, and
$P_0$ a default distribution on $\Omega$ with $P_0 \ll \mu$,
$dP_0/d\mu = p_0$, such that the expectation

$$E_{P_0}(X) = \int_\Omega X(r)\, p_0(r)\, \mu(dr) =: b_0$$

exists. Let $m$ and $M$ denote the $\mu$-ess inf and $\mu$-ess sup of $X$, and
adopt as standing assumptions

$$-\infty \le m < b_0 < M \le +\infty, \tag{6}$$

$$H(p) \ge H(p_0) = 0 \quad \text{whenever } \int p\, d\mu = 1. \tag{7}$$

Due to strict convexity of $\beta$, the inequality in (7) is strict if
$p \ne p_0$.

Example 1. Take $\mu = P_0$, thus $p_0 = 1$, and let $\beta(r,s) = f(s)$ be an
autonomous convex integrand, with $f(1) = 0$ to ensure (7). Then $H(p)$ in
(4) for $dP = p\, d\mu$ is the $f$-divergence $D_f(P \| P_0)$, introduced in
Csiszár [6]. If $f$ is cofinite, i.e. if $\lim_{s \to +\infty} f(s)/s = +\infty$,
then $P \ll P_0$ is a necessary condition for $D_f(P \| P_0) < +\infty$, hence
in that case $\Gamma$ in (3) equals the $f$-divergence ball
$\{P : D_f(P \| P_0) \le k\}$. If $f$ is not cofinite, $f$-divergence may be
finite also in absence of absolute continuity. Still, with some abuse of
terminology, the set in (3) will be called $f$-divergence ball also in that
case.

Example 2. Let $f$ be any strictly convex and differentiable function on
$(0,+\infty)$, and for $s \ge 0$ let $\beta(r,s) = \Delta_f(s, p_0(r))$ where

$$\Delta_f(s,t) := f(s) - f(t) - f'(t)(s - t). \tag{8}$$

Here $f(0)$ and $f'(0)$ are defined as limits; if $f(0) = +\infty$, we set
$\Delta_f(s,0) := 0$ for $s = 0$ and $\Delta_f(s,0) := +\infty$ otherwise.

In this example $P_0 \ll \mu$ is arbitrary, except that in case
$f'(0) = -\infty$ we assume that $p_0 > 0$ $\mu$-a.e. Then $H(p)$ equals the
Bregman distance [3]

$$B_{f,\mu}(p, p_0) := \int_\Omega \Delta_f(p(r), p_0(r))\, \mu(dr), \tag{9}$$

and $\Gamma$ is a Bregman ball of radius $k$ around $P_0$. Note that here the
assumption $f(1) = 0$ is not needed to guarantee (7), but may be adopted
anyhow, for the function $\Delta_f(s,t)$ is not affected by adding a constant
to $f$.
In the special case $f(s) = s \log s$ both examples give the $I$-divergence
ball $\Gamma = \{P : D(P \| P_0) \le k\}$ where

$$D(P \| P_0) := \int p \log \frac{p}{p_0}\, d\mu.$$

As another special case, the choice $f(s) = s^2$, $s \ge 0$, gives
$\Delta_f(s,t) = (s - t)^2$ and $B_{f,\mu}(p, p_0) = \int (p - p_0)^2\, d\mu$,
which is the squared $L^2$-distance between $p$ and $p_0$.
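The two special cases above can be checked numerically on a finite scenario space. The following is a minimal sketch, not part of the original development: the three-point space, the counting measure `mu` and the densities `p`, `p0` are hypothetical values chosen for illustration, and `bregman` is a helper name introduced here for the discrete analogue of (9).

```python
import math

def bregman(f, df, p, p0, mu):
    """Discrete analogue of (9): sum_r Delta_f(p(r), p0(r)) * mu(r)."""
    total = 0.0
    for s, t, w in zip(p, p0, mu):
        total += (f(s) - f(t) - df(t) * (s - t)) * w
    return total

# Hypothetical data: counting measure on three scenarios, positive densities.
mu = [1.0, 1.0, 1.0]
p0 = [0.2, 0.3, 0.5]
p = [0.4, 0.4, 0.2]

# f(s) = s log s: for two densities the Bregman distance is the I-divergence.
b_ent = bregman(lambda s: s * math.log(s), lambda t: math.log(t) + 1.0, p, p0, mu)
kl = sum(s * math.log(s / t) * w for s, t, w in zip(p, p0, mu))

# f(s) = s^2: the Bregman distance is the squared L2 distance.
b_sq = bregman(lambda s: s * s, lambda t: 2.0 * t, p, p0, mu)
l2_sq = sum((s - t) ** 2 * w for s, t, w in zip(p, p0, mu))

print(b_ent, kl)
print(b_sq, l2_sq)
```

The agreement in the first case relies on both `p` and `p0` having integral 1, so that the linear terms of $\Delta_f$ cancel.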


For $\Gamma$ of the form (3) the infimum in (1) equals

$$V(k) := \inf_{p:\ \int p\, d\mu = 1,\ H(p) \le k} \int X p\, d\mu, \tag{10}$$

and for $\alpha(P) := \lambda H(p)$, $\lambda > 0$, the infimum in (2) equals

$$W(\lambda) := \inf_{p:\ \int p\, d\mu = 1} \Bigl[ \int X p\, d\mu + \lambda H(p) \Bigr]. \tag{11}$$

The next lemma relates the solution of problem (10) to that of the following
minimisation problem, see Fig. 1:

$$F(b) := \inf_{p:\ \int p\, d\mu = 1,\ \int X p\, d\mu = b} H(p). \tag{12}$$

$F(b)$ is a convex function with minimum 0 attained at $b = b_0$. A further
standing assumption, in addition to (6) and (7), will be that

$$k_{\max} := \lim_{b \downarrow m} F(b) > 0. \tag{13}$$

This is a necessary condition for the functional $H$ to yield a nontrivial
measure of risk for the payoff function $X$, since $k_{\max} = 0$ would imply
$V(k) = m$ for each $k > 0$. Note that if $m = -\infty$ then
$k_{\max} = +\infty$ (subject to (13)), while if $m$ is finite then
$k_{\max} \le F(m)$, where the strict inequality is possible.

Lemma 1. Proposition 3.1] To each $k \in (0, k_{\max})$ there exists a unique
$b \in (m, b_0)$ with $F(b) = k$, and then $V(k) = b$. The minimum in (10) is
attained if and only if that in (12) is attained (for the above $b$), and then
the same $p$ attains both minima.


Remark 1. The assumption on $k$ is not restrictive, for if $k = 0$ or
$k \ge k_{\max} > 0$ then $V(k)$ trivially equals $b_0$ or $m$.

Remark 2. By [5, Theorem 2], the standing assumption (13) is equivalent
to (24) below (which automatically holds if $m > -\infty$), and that condition
implies $F(b) > 0$ for each $b < b_0$. In particular, the continuous convex
function $F(b)$, $b \in (m, b_0]$, is strictly decreasing, and $V(k)$,
$k \in [0, k_{\max})$, equals its inverse function.
Figure 1: Lemma 1 relates problem (10) to the information theoretic problem
(12): $F(V(k)) = k$.

2.2 Basic concepts and facts 

Lemma 1 allows us to treat problem (10) using known results about minimising
convex integral functionals under moment constraints, specifically with
moment mapping defined by $\phi(r) := (1, X(r))$. We will rely upon results in
Csiszár and Matúš [8]^4, specified for this moment mapping. Then the value
function in [8] becomes

$$J(a,b) := \inf_{p:\ \int p\, d\mu = a,\ \int X p\, d\mu = b} H(p), \tag{14}$$

thus $F(b) = J(1,b)$.

^4 Many of these results have been known earlier, though typically under less
general conditions.

The function $J$ in (14) is convex, and its effective domain
$\mathrm{dom}\, J := \{(a,b) : J(a,b) < +\infty\}$ has interior

$$\mathrm{int\,dom}\, J = \{(a,b) : a > 0,\ am < b < aM\}, \tag{15}$$

by [8, Lemma 6.6]. The function $J$ is proper (not identically $+\infty$ and
never equal to $-\infty$) because it equals zero at
$(1, b_0) \in \mathrm{int\,dom}\, J$, see (6), (7). Hence its convex conjugate
$J^*(\theta_1,\theta_2) := \sup_{a,b} [\theta_1 a + \theta_2 b - J(a,b)]$ is a
closed (i.e., lower semicontinuous) proper convex function. A crucial fact is
the instance of [8, Theorem 1.1] that

$$J^*(\theta_1,\theta_2) = K(\theta_1,\theta_2) := \int_\Omega \beta^*(r, \theta_1 + \theta_2 X(r))\, \mu(dr), \tag{16}$$

where $\beta^*$ is the convex conjugate of $\beta$,

$$\beta^*(r,\tau) := \sup_{s \in \mathbb{R}} \bigl( s\tau - \beta(r,s) \bigr). \tag{17}$$

The conjugate and derivatives of $\beta$ are taken with respect to the second
variable.

Below, derivatives at 0 and $+\infty$ are interpreted as limits of derivatives
at $s \downarrow 0$ and $s \uparrow +\infty$. For fixed $r \in \Omega$, the
function $\beta^*(r,\cdot)$ equals $-\beta(r,0)$ for $\tau \le \beta'(r,0)$, it
is strictly convex in the interval $(\beta'(r,0), \beta'(r,+\infty))$, and
equals $+\infty$ if $\beta'(r,+\infty)$ is finite and
$\tau > \beta'(r,+\infty)$. This function is differentiable in the interval
$(-\infty, \beta'(r,+\infty))$. Its derivative $(\beta^*)'(r,\tau)$ is positive
and strictly increasing in $(\beta'(r,0), \beta'(r,+\infty))$, and approaches
0 or $+\infty$ as $\tau \downarrow \beta'(r,0)$ or
$\tau \uparrow \beta'(r,+\infty)$.
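As an illustration of the conjugate (17), consider the autonomous integrand $f(s) = s \log s$, for which $f'(0) = -\infty$, $f'(+\infty) = +\infty$ and the conjugate is known in closed form, $f^*(\tau) = e^{\tau - 1}$. The sketch below compares a brute-force grid approximation of the supremum with this closed form. The grid bounds `s_max` and `n` and the test points `taus` are arbitrary choices made here, and `conjugate_on_grid` is a hypothetical helper name.

```python
import math

def f(s):
    # Autonomous integrand f(s) = s log s, with f(0) = 0 defined as the limit.
    return 0.0 if s == 0.0 else s * math.log(s)

def conjugate_on_grid(tau, s_max=10.0, n=100_000):
    """Approximate f*(tau) = sup_{s >= 0} (s*tau - f(s)) by a grid search."""
    best = -math.inf
    for i in range(n + 1):
        s = s_max * i / n
        best = max(best, s * tau - f(s))
    return best

taus = [-1.0, 0.0, 1.0, 2.0]
approx = [conjugate_on_grid(t) for t in taus]
exact = [math.exp(t - 1.0) for t in taus]   # closed form of f* for s log s

for t, a, e in zip(taus, approx, exact):
    print(t, a, e)
```

The grid search is only adequate here because the maximiser $s = e^{\tau - 1}$ stays well inside $[0, 10]$ for the chosen values of $\tau$.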

Since $J^* = K$ implies $J^{**} = K^*$, and $J^{**}$ (equal to the closure of
$J$) may differ from $J$ only on the boundary of $\mathrm{dom}\, J$,

$$F(b) = J(1,b) = K^*(1,b) = \sup_{\theta_1,\theta_2} [\theta_1 + \theta_2 b - K(\theta_1,\theta_2)], \tag{18}$$

except possibly for $b$ equal to $m$ or $M$, see (15). This can be rewritten as

$$F(b) = \sup_{\theta_2} [\theta_2 b - G(\theta_2)] = G^*(b) \tag{19}$$

where

$$G(\theta_2) := \inf_{\theta_1} [K(\theta_1,\theta_2) - \theta_1]. \tag{20}$$

The function $G$ will play a similar role as the logarithmic moment generating
function does when $\Gamma$ in (3) is an $I$-divergence ball, see Example 3. A
consequence of (19) applied to $b = b_0$, where $F(b_0) = 0$, is the simple
bound

$$G(\theta_2) \ge \theta_2 b_0. \tag{21}$$
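The relations (19) and (21) can be illustrated in the relative entropy case on a finite scenario space, where $G$ is the logarithmic moment generating function of $X$ under $P_0$ (see Example 3). The sketch below is an illustration under these assumptions only: the payoffs `X` and probabilities `p0` are hypothetical, and the supremum in (19) is located by bisection on the first-order condition $G'(\theta_2) = b$.

```python
import math

X = [0.0, 1.0, 3.0]      # hypothetical payoffs
p0 = [0.5, 0.3, 0.2]     # hypothetical default probabilities (mu = P0)
b0 = sum(x * w for x, w in zip(X, p0))

def G(theta):
    """G(theta2) = log E_{P0} exp(theta2 * X) in the relative entropy case."""
    return math.log(sum(w * math.exp(theta * x) for x, w in zip(X, p0)))

def tilted_mean(theta):
    """G'(theta): mean of X under the exponentially tilted distribution."""
    z = [w * math.exp(theta * x) for x, w in zip(X, p0)]
    return sum(x * zi for x, zi in zip(X, z)) / sum(z)

def F(b, lo=-50.0, hi=50.0):
    """F(b) = sup_theta [theta*b - G(theta)] as in (19), for m < b < M."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) < b:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return theta * b - G(theta), theta

value_at_b0, theta_at_b0 = F(b0)
value_below, theta_below = F(0.5)
print(value_at_b0, theta_at_b0)   # F(b0) = 0, attained at theta2 = 0
print(value_below, theta_below)   # F(b) > 0 with a negative maximiser for b < b0
```

The negative sign of the maximiser for $b < b_0$ is the reason why only $\theta_2 < 0$ matters for the worst case problem.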

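The following family of non-negative functions on $\Omega$ will play a key
role, like exponential families do for $I$-divergence minimisation:

$$p_{\theta_1,\theta_2}(r) := (\beta^*)'(r, \theta_1 + \theta_2 X(r)), \qquad (\theta_1,\theta_2) \in \Theta, \tag{22}$$

where

$$\Theta := \{ (\theta_1,\theta_2) \in \mathrm{dom}\, K : \theta_1 + \theta_2 X(r) < \beta'(r,+\infty)\ \ \mu\text{-a.e.} \}. \tag{23}$$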
Remark 3. It may happen that different parameters
$(\theta_1,\theta_2) \in \Theta$ give rise to equal functions (22), but only in
case of functions that equal zero except for $r$ in a set where $X(r)$ is
constant $\mu$-a.e. This follows because for any fixed $r \in \Omega$ the fact
that $(\beta^*)'(r,\tau)$ is strictly increasing for
$\tau \in (\beta'(r,0), \beta'(r,+\infty))$ implies that
$p_{\theta_1,\theta_2}(r)$ in (22), if positive, uniquely determines
$\theta_1 + \theta_2 X(r)$. In particular, for positive valued functions (22)
the parameters $(\theta_1,\theta_2) \in \Theta$ are always unique, due to the
standing assumption (6).

As $(\theta_1,\theta_2) \in \mathrm{dom}\, K$ implies
$(\theta_1',\theta_2) \in \Theta$ for each $\theta_1' < \theta_1$, the sets
$\mathrm{dom}\, K$ and $\Theta$ have the same projection to the
$\theta_2$-axis. This projection will be denoted by $\Theta_2$. It is a (finite
or infinite) interval. The standing assumptions (6), (7) imply that $\Theta_2$
contains the origin, and the default density $p_0$ belongs to the family (22)
with $\theta_2 = 0$, see [5, Remark 4]. The left endpoint of the interval
$\Theta_2$ will be denoted by $\theta_{\min}$. By [5, Theorem 2], the standing
assumption $k_{\max} > 0$ is equivalent to

$$\theta_{\min} < 0. \tag{24}$$

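By [8, Lemma 3.6], the directional derivatives of the function $K$ in (16)
can be expressed, at any $(\theta_1,\theta_2) \in \Theta$ and for any
$(\tilde\theta_1,\tilde\theta_2) \in \mathrm{dom}\, K$, as

$$\lim_{t \downarrow 0} \frac{1}{t} \Bigl[ K\bigl(\theta_1 + t(\tilde\theta_1 - \theta_1),\ \theta_2 + t(\tilde\theta_2 - \theta_2)\bigr) - K(\theta_1,\theta_2) \Bigr]
= \int_\Omega \bigl[ \tilde\theta_1 - \theta_1 + (\tilde\theta_2 - \theta_2) X(r) \bigr]\, p_{\theta_1,\theta_2}(r)\, \mu(dr), \tag{25}$$

where the integral is well-defined and is not equal to $+\infty$. In
particular, $K$ is differentiable in the interior of its effective domain,
with

$$\frac{\partial}{\partial\theta_1} K(\theta_1,\theta_2) = \int p_{\theta_1,\theta_2}\, d\mu, \tag{26}$$

$$\frac{\partial}{\partial\theta_2} K(\theta_1,\theta_2) = \int X p_{\theta_1,\theta_2}\, d\mu. \tag{27}$$

The same equations hold at $(\theta_1,\theta_2) \in \Theta$ on the boundary of
$\mathrm{dom}\, K$ for those one-sided partial derivatives of $K$ which are
defined there, thus (26) holds for the left partial derivative at each
$(\theta_1,\theta_2) \in \Theta$.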
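The derivative formulas (26) and (27) lend themselves to a quick finite-difference check. The sketch below assumes the relative entropy integrand $f(s) = s \log s$, for which $\beta^*(r,\tau) = e^{\tau - 1}$ and hence $p_{\theta_1,\theta_2}(r) = e^{\theta_1 + \theta_2 X(r) - 1}$. The finite space, the weights `mu`, the evaluation point and the step `h` are hypothetical choices for illustration.

```python
import math

X = [0.0, 1.0, 3.0]     # hypothetical payoffs
mu = [0.5, 0.3, 0.2]    # hypothetical weights of the measure mu

def K(t1, t2):
    """K(theta1, theta2) = integral of exp(theta1 + theta2*X - 1) d(mu)."""
    return sum(w * math.exp(t1 + t2 * x - 1.0) for x, w in zip(X, mu))

def p_theta(t1, t2):
    """The family (22) for f(s) = s log s."""
    return [math.exp(t1 + t2 * x - 1.0) for x in X]

t1, t2, h = 0.4, -0.7, 1e-5

# Central finite differences for the two partial derivatives of K.
dK1 = (K(t1 + h, t2) - K(t1 - h, t2)) / (2 * h)
dK2 = (K(t1, t2 + h) - K(t1, t2 - h)) / (2 * h)

# Right-hand sides of (26) and (27).
mass = sum(q * w for q, w in zip(p_theta(t1, t2), mu))
moment = sum(x * q * w for x, q, w in zip(X, p_theta(t1, t2), mu))

print(dK1, mass)
print(dK2, moment)
```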

The following lemma gives relevant information about evaluating the
function $G$ in (20). Its proof is effectively contained in the proof of
[5, Proposition 2], but for convenience a full proof will be given in the
Appendix. Clearly, $\mathrm{dom}\, G := \{\theta_2 : G(\theta_2) < +\infty\} = \Theta_2$.

Lemma 2. Given any $\theta_2 \in \Theta_2$, either (i) some
$\theta_1 \in \mathbb{R}$ satisfies $(\theta_1,\theta_2) \in \Theta$,
$\int p_{\theta_1,\theta_2}\, d\mu = 1$, or (ii)
$\bar\theta_1 := \sup\{\theta_1 : (\theta_1,\theta_2) \in \mathrm{dom}\, K\}$ is
finite, $(\bar\theta_1,\theta_2) \in \Theta$,
$\int p_{\bar\theta_1,\theta_2}\, d\mu < 1$. In either case, in (20) the
minimum is attained, and the unique minimiser is $\theta_1$ respectively
$\bar\theta_1$.
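Both alternatives of Lemma 2 can be seen in a concrete setting. The sketch below anticipates the setting of Example 4: $\Omega = (0,1)$, $X(r) = r$, $\mu(dr) = 2r\,dr$ and $f(s) = -\log s$, so that $p_{\theta_1,\theta_2}(r) = 1/(-\theta_1 - \theta_2 r)$ for $\theta_1 \le 0$, $\theta_2 < 0$. The closed form of the integral used in `mass` is elementary calculus worked out here, and the function names and the bisection bracket are hypothetical choices.

```python
import math

def mass(theta1, theta2):
    """Integral over (0,1) of 1/(-theta1 - theta2*r) against mu(dr) = 2r dr.

    With a = -theta1 >= 0 and c = -theta2 > 0 this equals
    (2/c) * (1 - (a/c) * log(1 + c/a)), and 2/c when a = 0.
    """
    a, c = -theta1, -theta2
    if a == 0.0:
        return 2.0 / c
    return (2.0 / c) * (1.0 - (a / c) * math.log1p(c / a))

def normalising_theta1(theta2, lo=-1e3):
    """Case (i): theta1 <= 0 with mass 1; returns None in case (ii)."""
    top = mass(0.0, theta2)          # mass is increasing in theta1
    if top < 1.0:
        return None                  # case (ii): sup of theta1 is 0, mass < 1
    if top == 1.0:
        return 0.0
    hi = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mass(mid, theta2) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t1 = normalising_theta1(-1.0)        # case (i): a normalising theta1 exists
print(t1, mass(t1, -1.0))
print(normalising_theta1(-4.0), mass(0.0, -4.0))   # case (ii): mass 0.5 < 1
```

For $\theta_2 = -4$ no admissible $\theta_1$ brings the integral up to 1, which is the situation in which the function (22) attaining the minimum in (20) fails to be a density.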

2.3 Generalised Pythagorean identity 

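Given the convex integrand $\beta$, define $\Delta_{\beta(r,\cdot)}(s,t)$ as in
(8), with the convex function $\beta(r,\cdot) : s \mapsto \beta(r,s)$ playing
the role of $f$. The mapping
$(r,s,t) \mapsto \Delta_{\beta(r,\cdot)}(s,t)$ is a normal integrand
[8, Lemma 2.10], hence if $p$ and $q$ are non-negative measurable functions on
$\Omega$ then so is also $\Delta_{\beta(r,\cdot)}(p(r), q(r))$, denoted briefly
by $\Delta_\beta(p,q)$. Extending the concept of Bregman distance (9), define

$$B(p,q) = B_{\beta,\mu}(p,q) := \int_\Omega \Delta_\beta(p,q)\, d\mu. \tag{28}$$

Like its special case in (9), it is non-negative and equals 0 only if $p = q$.
If $\beta = \Delta_f$, as in Example 2, then $B_{\beta,\mu}$ is equal to the
$B_{f,\mu}$ of (9).

The following lemma, crucial for this paper, is an instance of
[8, Lemma 4.15], combined with [8, Remark 4.13].

Lemma 3. For each density $p$ with $\int X p\, d\mu$ finite, and each
$(\theta_1,\theta_2) \in \Theta$,

$$H(p) = \theta_1 + \theta_2 \int X p\, d\mu - K(\theta_1,\theta_2) + B(p, p_{\theta_1,\theta_2}) + \int_\Omega |\beta'(r,0) - \theta_1 - \theta_2 X(r)|_+\, p(r)\, \mu(dr). \tag{29}$$

If $p_{\theta_1,\theta_2}$ is a density, the special case
$p = p_{\theta_1,\theta_2}$ of (29) (or direct calculation) gives that

$$H(p_{\theta_1,\theta_2}) = \theta_1 + \theta_2 \int X p_{\theta_1,\theta_2}\, d\mu - K(\theta_1,\theta_2). \tag{30}$$

Then (29) and (30) imply

$$H(p) = H(p_{\theta_1,\theta_2}) + B(p, p_{\theta_1,\theta_2}) + \int_\Omega |\beta'(r,0) - \theta_1 - \theta_2 X(r)|_+\, p(r)\, \mu(dr) \tag{31}$$

for each density $p$ satisfying

$$\int X p\, d\mu = \int X p_{\theta_1,\theta_2}\, d\mu. \tag{32}$$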
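The identity (31) can be verified numerically in the relative entropy case, where $\beta'(r,0) = -\infty$ makes the last term vanish and the Bregman distance of densities is the $I$-divergence. The sketch below is an illustration under stated assumptions: a hypothetical three-point space with $\mu = P_0$, a hypothetical parameter $\theta_2 = -0.5$, and a perturbation direction chosen so that the perturbed density `p` has the same total mass and expected payoff as `p_theta`, as required by (32).

```python
import math

X = [0.0, 1.0, 3.0]        # hypothetical payoff values
mu = [0.5, 0.3, 0.2]       # hypothetical weights of the measure mu = P0

def H(p):
    """H(p) = integral of p log p d(mu), relative entropy w.r.t. mu = P0."""
    return sum(s * math.log(s) * w for s, w in zip(p, mu))

def bregman(p, q):
    """B(p, q) of (28) for beta(r, s) = s log s."""
    return sum((s * math.log(s / t) - s + t) * w for s, t, w in zip(p, q, mu))

def mean(p):
    return sum(x * s * w for x, s, w in zip(X, p, mu))

theta2 = -0.5
log_mgf = math.log(sum(w * math.exp(theta2 * x) for x, w in zip(X, mu)))
theta1 = 1.0 - log_mgf                      # makes p_theta a density
p_theta = [math.exp(theta1 + theta2 * x - 1.0) for x in X]

# Perturb p_theta without changing total mass or expected payoff: the weighted
# perturbation t*(2, -3, 1) is orthogonal to (1, 1, 1) and to X = (0, 1, 3).
t = 0.02
p = [q + t * c / w for q, c, w in zip(p_theta, (2.0, -3.0, 1.0), mu)]

lhs = H(p)
rhs = H(p_theta) + bregman(p, p_theta)
print(lhs, rhs)
```

Without the moment condition (32) the two sides differ in general, which is why the perturbation direction has to be orthogonal to both the constant function and $X$.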
Identities like (31) frequently occur in the literature, primarily in cases
when the last term vanishes (it trivially does if $\beta'(r,0) = -\infty$).
They are referred to as Pythagorean identities^5, and (29) will be called
generalised Pythagorean identity.

The above results admit a short proof of the following key lemma, see
[5, Theorem 1] for a related result.

Lemma 4. (i) Let $(\theta_1,\theta_2) \in \Theta$,
$\int p_{\theta_1,\theta_2}\, d\mu = 1$. Then
$\int X p_{\theta_1,\theta_2}\, d\mu$ is finite if and only if
$H(p_{\theta_1,\theta_2})$ is. In that case the density
$p = p_{\theta_1,\theta_2}$ uniquely^6 attains the minimum in the definition
(12) of $F(b)$ for $b := \int X p_{\theta_1,\theta_2}\, d\mu$, and

$$F(b) = H(p_{\theta_1,\theta_2}) = \theta_1 + \theta_2 b - K(\theta_1,\theta_2) = \theta_2 b - G(\theta_2). \tag{33}$$

Supposing $H(p_{\theta_1,\theta_2}) > 0$, here $b$ is less or larger than $b_0$
according as $\theta_2$ is negative or positive.

(ii) For $k \in (0, k_{\max})$, a density $p$ attains the minimum in the
definition (10) of $V(k)$ if and only if $p = p_{\theta_1,\theta_2}$ for some
$(\theta_1,\theta_2) \in \Theta$ with $\theta_2 < 0$ and

$$H(p_{\theta_1,\theta_2}) = k \quad \text{or equivalently} \quad \int X p_{\theta_1,\theta_2}\, d\mu = V(k). \tag{34}$$

Proof. (i) The first assertion holds by (30), and the second one since (31),
(32) imply $H(p) \ge H(p_{\theta_1,\theta_2})$, with equality only for
$p = p_{\theta_1,\theta_2}$, for each density $p$ with
$\int X p\, d\mu = b = \int X p_{\theta_1,\theta_2}\, d\mu$. Then (33) follows
by (30) and the consequence $K(\theta_1,\theta_2) - \theta_1 = G(\theta_2)$ of
Lemma 2. Finally, (21) and (33) yield $\theta_2 b > \theta_2 b_0$, proving the
last assertion of part (i).

(ii) For sufficiency, it is enough to verify the equivalence (34) under
the given hypotheses. The function $V : (0, k_{\max}) \to (m, b_0)$ is the
inverse of $F : (m, b_0) \to (0, k_{\max})$, see Lemma 1 and Remark 2. This and
the result $F(\int X p_{\theta_1,\theta_2}\, d\mu) = H(p_{\theta_1,\theta_2})$
of part (i) imply (34), because if $0 < H(p_{\theta_1,\theta_2}) < k_{\max}$
then $m < \int X p_{\theta_1,\theta_2}\, d\mu < b_0$ (the upper bound follows
from $\theta_2 < 0$, due to the last assertion of part (i)). Regarding
necessity, a density $p$ that attains the minimum in (10) clearly satisfies the
constraint $H(p) \le k$ with equality. We skip the proof of the remaining
assertion that $p$ has to be of form $p_{\theta_1,\theta_2}$ with
$\theta_2 < 0$, for this will be an immediate consequence of Theorem 2. □

^5 If $\beta(r,s) = f(s) = s^2 - 1$ ($s \ge 0$) then
$p_{\theta_1,\theta_2} = \tfrac12 |\theta_1 + \theta_2 X(r)|_+$ and (31)
reduces to the classical Pythagorean identity
$\|p\|^2 = \|p_{\theta_1,\theta_2}\|^2 + \|p - p_{\theta_1,\theta_2}\|^2$,
provided that (32) holds for $\theta_1, \theta_2$ with
$\theta_1 + \theta_2 X \ge 0$.

^6 Uniqueness is meant for the function, in the $\mu$-a.e. sense. See Remark 3.
3 New results


3.1 Calculating V(k) 

A procedure to calculate $V(k)$ in (10) is to first determine the function $K$
in (16), then the function $F$ via (18) (this may be done in two steps, first
determining the function $G$ in (20)), and finally $V(k)$ as the solution
$b \in (m, b_0)$ of the equation $F(b) = k$, see Lemma 1. In regular cases,
$b = V(k)$ is characterised by equations involving partial derivatives of the
function $K$, see [5, Corollary 1], which may facilitate its computation. The
following Theorem, combined with Lemma 2, may help to reduce computational
complexity even in “irregular” cases. Previously, Ahmadi-Javid
[1, Theorem 5.1] proved an identity equivalent to (35) for autonomous
integrands and bounded payoff functions.

We first state a lemma, which will be proved in the Appendix.

Lemma 5. $F^* = G$.


Note that while $F^* = G^{**} = \mathrm{cl}\, G$ immediately follows from (19),
it appears nontrivial that the function $G$ is closed.

Theorem 1. For $k \in (0, k_{\max})$,

$$V(k) = \max_{\theta_2 < 0}\ \max_{\theta_1 \in \mathbb{R}} \frac{k + K(\theta_1,\theta_2) - \theta_1}{\theta_2} = \max_{\theta_2 < 0} \frac{k + G(\theta_2)}{\theta_2}. \tag{35}$$

A maximiser for the second maximum in (35) is equivalently a maximiser
of $\theta_2 b - G(\theta_2)$ where $b = V(k)$. A pair $(\theta_1,\theta_2)$
attains the first maximum in (35) if and only if it attains the maximum in
(18), for $b = V(k)$. Such $(\theta_1,\theta_2)$ belongs to $\Theta$ and
satisfies $\int p_{\theta_1,\theta_2}\, d\mu \le 1$.

Proof. The conditions $b = V(k)$, $k \in (0, k_{\max})$ are equivalent to
$F(b) = k$, $b \in (m, b_0)$, see Lemma 1 and Remark 2. The condition that
$\theta_2$ is a maximiser of $\theta_2 b - G(\theta_2)$ means, by (19), that

$$\theta_2 b - G(\theta_2) = G^*(b) = F(b) = k \tag{36}$$

or equivalently, see Lemma 5, that $\theta_2 b - F(b) = F^*(\theta_2)$. This
proves that $\theta_2$ is a maximiser of $\theta_2 b - G(\theta_2)$ if and only
if^7 $F'_-(b) \le \theta_2 \le F'_+(b)$. In particular, a (perhaps non-unique)
maximiser $\theta_2 < 0$ does exist.

By (36), the maximum of $\theta_2 b - G(\theta_2)$ equals $k$, hence
$\theta_2 b - G(\theta_2) \le k$ for each $\theta_2 \in \mathbb{R}$. Dividing
by $\theta_2 < 0$ reverses the inequality, which proves the assertions that the
maximum of

$$\frac{k + G(\theta_2)}{\theta_2} \qquad (\theta_2 < 0) \tag{37}$$

is equal to $b = V(k)$, and a maximiser of (37) is equivalently a maximiser
of $\theta_2 b - G(\theta_2)$. The remaining assertions of Theorem 1
immediately follow from this and Lemma 2. □

^7 Here $F'_-$ and $F'_+$ denote one-sided derivatives; the differentiability
of the function $F$ is not addressed.


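The calculation of $W(\lambda)$ in (11) is somewhat less costly than that of
$V(k)$. It requires the calculation of $G(\theta_2)$ only for a single value of
$\theta_2$, since for $\lambda > 0$ we have (using Lemma 5 in the final step)

$$W(\lambda) = \inf_b [b + \lambda F(b)] = -\lambda \sup_b \Bigl[ -\frac{b}{\lambda} - F(b) \Bigr] = -\lambda F^*\Bigl(-\frac{1}{\lambda}\Bigr) = -\lambda G\Bigl(-\frac{1}{\lambda}\Bigr). \tag{38}$$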
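In the relative entropy case, where $G$ is the logarithmic moment generating function of $X$ under $P_0$ (see Example 3), formula (38) reduces to $W(\lambda) = -\lambda \log E_{P_0} e^{-X/\lambda}$. The following sketch compares this with the penalised objective of (11) evaluated at the exponentially tilted distribution and at two other distributions. The finite space, the numbers `X`, `p0`, `lam` and the function names are hypothetical choices made for illustration.

```python
import math

X = [0.0, 1.0, 3.0]      # hypothetical payoffs
p0 = [0.5, 0.3, 0.2]     # hypothetical default probabilities (mu = P0)

def log_mgf(theta):
    return math.log(sum(w * math.exp(theta * x) for x, w in zip(X, p0)))

def W(lam):
    """Formula (38) with G equal to the log moment generating function."""
    return -lam * log_mgf(-1.0 / lam)

def objective(q, lam):
    """Expected payoff plus lam times relative entropy of q from p0."""
    payoff = sum(x * qi for x, qi in zip(X, q))
    entropy = sum(qi * math.log(qi / wi) for qi, wi in zip(q, p0))
    return payoff + lam * entropy

lam = 2.0
weights = [w * math.exp(-x / lam) for x, w in zip(X, p0)]
gibbs = [z / sum(weights) for z in weights]   # tilted distribution, theta2 = -1/lam

print(W(lam), objective(gibbs, lam))
print(objective(p0, lam), objective([0.6, 0.3, 0.1], lam))
```

Only one evaluation of the log moment generating function is needed here, in line with the remark that $W(\lambda)$ is cheaper to compute than $V(k)$.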



Figure 2: The supporting line has maximum slope
$b = (k + G(\theta_2))/\theta_2$ among all lines passing through $(0, -k)$ and
some point of $\mathcal{G}$. This slope is equal to the solution $V(k)$ of
problem (10).

Remark 4. The following geometric interpretation of the proof of Theorem 1
deserves emphasis, see Fig. 2. Denote by

$$\mathcal{G} := \{ (\theta_2, G(\theta_2)) : \theta_2 \in \Theta_2,\ \theta_2 \le 0 \} \tag{39}$$

the graph of $G$ restricted to nonpositive arguments. Recall that, by Lemma 5,
$G$ is a closed convex function with $G(0) = 0$. Then (37) is the slope of the
straight line through $(0, -k)$ and $(\theta_2, G(\theta_2)) \in \mathcal{G}$,
which is maximised by the supporting line to $\mathcal{G}$ through $(0, -k)$.
The proof of Theorem 1 shows that this supporting line (exists and) has slope
$b = V(k)$. The maximum $b = V(k)$ of (37) is attained if and only if
$(\theta_2, G(\theta_2))$ is on this supporting line.

3.2 Almost worst case distributions 

For the problem (10), call a density $p$ an
($\epsilon$-$\gamma$)-Almost-Worst-Case-Density (AWCD), where
$\epsilon \ge 0$, $\gamma \ge 0$, if

$$H(p) \le k + \gamma \quad \text{and} \quad \int X p\, d\mu \le V(k) + \epsilon. \tag{40}$$

Thus, an ($\epsilon$-$\gamma$)-AWCD is a density which does not violate the
constraint $H(p) \le k$ by more than $\gamma$ and for which the expected payoff
does not exceed by more than $\epsilon$ the worst possible one subject to the
constraint. A worst case density (WCD) is a (0-0)-AWCD.

An ($\epsilon$-$\gamma$) almost worst case distribution or a worst case
distribution is a distribution $P$ whose density is an
($\epsilon$-$\gamma$)-AWCD or a WCD.

Theorem 2 below establishes a clustering property of the
($\epsilon$-$\gamma$)-AWCDs, as well as a similar result for densities that
almost attain the minimum in (11). From a practical point of view, this may be
relevant for efficient hedging against the almost worst scenarios, but this
issue is not pursued here.

Let us assign to each $\theta_2 \in \Theta_2$ the unique $\theta_1$ attaining
the minimum in the definition (20) of $G(\theta_2)$, determined in Lemma 2,
and denote

$$q_{\theta_2}(r) := p_{\theta_1,\theta_2}(r), \quad \text{with } \theta_1 \text{ attaining } K(\theta_1,\theta_2) - \theta_1 = G(\theta_2). \tag{41}$$

Given $k \in (0, k_{\max})$, we will denote by $q_k$ the function
$q_{\theta_2}$ with $\theta_2 < 0$ attaining the second maximum in Theorem 1,
i.e.,

$$q_k := q_{\theta_2} = p_{\theta_1,\theta_2} \quad \text{with } (\theta_1,\theta_2) \text{ a maximiser in (35).} \tag{42}$$

Theorem 2. (i) For $k \in (0, k_{\max})$, each ($\epsilon$-$\gamma$)-AWCD $p$
belongs to the Bregman neighborhood of radius $(\gamma - \theta_2 \epsilon)$ of
$q_k$ in (42), i.e., see (28),

$$B(p, q_k) \le \gamma - \theta_2 \epsilon \quad \text{if } p \text{ is an } (\epsilon\text{-}\gamma)\text{-AWCD}. \tag{43}$$

(ii) For $\lambda > 0$ with $-1/\lambda \in \Theta_2$, set
$\theta_2 := -1/\lambda$. Then for each density $p$

$$\int X p\, d\mu + \lambda H(p) \ge W(\lambda) + \lambda B(p, q_{\theta_2}). \tag{44}$$

Corollary 1. Let $\{p_n\}$ be a sequence of
($\epsilon_n$-$\gamma_n$)-AWCDs with $\epsilon_n \to 0$, $\gamma_n \to 0$ in
case (i), or a sequence of densities with
$\int X p_n\, d\mu + \lambda H(p_n) \to W(\lambda)$ in case (ii). Then $p_n$
converges to $q_k$ respectively to $q_{\theta_2}$ locally in measure^8. In
particular, the function $q_k$ is unique.

^8 This means that
$\mu(\{r \in C : |p_n(r) - q_{\theta_2}(r)| > \epsilon\}) \to 0$ for each
$C \subset \Omega$ with $\mu(C)$ finite, and any $\epsilon > 0$. If $\mu$ is a
finite measure, this is equivalent to standard (global) convergence in measure.

Proof. (i) By the generalised Pythagorean identity of Lemma 3, applied to
$\theta_1, \theta_2$ in (42),

$$H(p) \ge \theta_1 + \theta_2 \int X p\, d\mu - K(\theta_1,\theta_2) + B(p, p_{\theta_1,\theta_2}) = \theta_2 \int X p\, d\mu - G(\theta_2) + B(p, q_{\theta_2}), \tag{45}$$

for each density $p$. As $\theta_2$ attains the maximum in Theorem 1, here
$q_{\theta_2} = q_k$ and $G(\theta_2) = \theta_2 V(k) - k$. Hence, (45) implies
(43) for each density $p$ which is an ($\epsilon$-$\gamma$)-AWCD, thus
satisfies (40).

(ii) In this case, (45) holds as before. Multiplying it by
$\lambda = -1/\theta_2$ and using that $-\lambda G(-1/\lambda) = W(\lambda)$,
see (38), we obtain (44).

The Corollary follows since $B(p_n, q_k) \to 0$ implies convergence of $p_n$
to $q_k$ locally in measure [8, Corollary 2.14]. □

Remark 5. Corollary 1 extends the known result that $q_k$ is a generalized
solution of problem (12) in the sense [8] that densities $p_n$ with
$\int X p_n\, d\mu = b = V(k)$, $H(p_n) \to F(b) = k$ converge to $q_k$ locally
in measure, and also establishes its (new) counterpart for problem (10).

The function $q_k$ in Theorem 2(i) will be called worst case localiser, for
the almost worst case densities are clustering in its (Bregman) neighborhood.
This intuitive interpretation of the function $q_k = p_{\theta_1,\theta_2}$ is
complemented by the additional intuitive fact that, by (43), its parameter
$\theta_2$ controls the radius of that neighborhood. Most appealing is the
special case $\gamma = 0$ of (43): all densities that satisfy $H(p) \le k$
and yield expected payoff not exceeding the worst case by more than
$\epsilon$ are contained in a Bregman neighborhood of $q_k$ of radius
proportional to $\epsilon$, with proportionality factor $-\theta_2$. The
essence of the Corollary is that the Bregman distance of $p_n$ from $q_k$ goes
to 0. For certain choices of the integrand $\beta$ this implies convergence
even in a stronger sense than locally in measure, see Example 3.

Clearly, the worst case localiser coincides with the WCD whenever the
latter exists (apply (43) to $\epsilon = \gamma = 0$). A necessary and
sufficient condition for the existence of a WCD is given in Lemma 4(ii). There,
we have skipped the proof that a WCD has to be of form
$p_{\theta_1,\theta_2}$ with $\theta_2 < 0$, which is obvious now as the WCD is
a worst case localiser. Lemma 6 below will also be useful in identifying
situations when the worst case localiser is actually a WCD. Its Corollary
addresses the simplest such situation. When in (10) the minimum is not
attained, the worst case localiser may or may not be a density, see the
examples below, though it always satisfies $\int q_k\, d\mu \le 1$, see
Theorem 1. Note that the computation of the worst case localiser is not
harder than the computation of $V(k)$ along the lines of Subsection 3.1, for
that calculation does provide the parameters $\theta_1, \theta_2$ of
$q_k = p_{\theta_1,\theta_2}$ that attain the double maximum in Theorem 1.

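Lemma 6. A function $p_{\theta_1,\theta_2}$ in (22) with $\theta_2 < 0$ is the
worst case localiser $q_k$ for $k \in (0, k_{\max})$ if and only if the vector
$(1 - \int p_{\theta_1,\theta_2}\, d\mu,\ V(k) - \int X p_{\theta_1,\theta_2}\, d\mu) \in \mathbb{R}^2$
belongs to the normal cone of $\mathrm{dom}\, K$ at $(\theta_1,\theta_2)$,
i.e., for each $(\tilde\theta_1,\tilde\theta_2) \in \mathrm{dom}\, K$

$$(\tilde\theta_1 - \theta_1) \Bigl[ 1 - \int p_{\theta_1,\theta_2}\, d\mu \Bigr] + (\tilde\theta_2 - \theta_2) \Bigl[ V(k) - \int X p_{\theta_1,\theta_2}\, d\mu \Bigr] \le 0. \tag{46}$$

Corollary 2. If the worst case localiser $q_k = p_{\theta_1,\theta_2}$ has
parameters $(\theta_1,\theta_2)$ in the interior of $\mathrm{dom}\, K$, then it
is a WCD.

Proof. By Theorem 1, the condition in the definition (42) is equivalent to
the condition that $(\theta_1,\theta_2)$ attains the maximum of
$f(\tilde\theta_1,\tilde\theta_2) := \tilde\theta_1 + \tilde\theta_2 b - K(\tilde\theta_1,\tilde\theta_2)$
where $b = V(k)$. The latter is satisfied if and only if for each
$(\tilde\theta_1,\tilde\theta_2) \in \mathrm{dom}\, K$, the concave function

$$\tilde f(t) := f\bigl(\theta_1 + t(\tilde\theta_1 - \theta_1),\ \theta_2 + t(\tilde\theta_2 - \theta_2)\bigr), \qquad 0 \le t \le 1,$$

is maximised by $t = 0$, i.e., its (right) derivative at $t = 0$ is
nonpositive. On account of (25), that condition is equivalent to (46). The
Corollary follows since the normal cone of $\mathrm{dom}\, K$ at an interior
point consists of $(0,0)$ alone. Thus Lemma 6 gives the conditions
$\int p_{\theta_1,\theta_2}\, d\mu = 1$,
$\int X p_{\theta_1,\theta_2}\, d\mu = V(k)$, which mean by Lemma 4 that
$p_{\theta_1,\theta_2}$ is a WCD. □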
Example 3. Let $H(p)$ be the $I$-divergence, formally (see Example 1) let
$\beta$ be the autonomous integrand given by $f(s) = s \log s$ and let
$P_0 = \mu$. Then $f^*(\tau) = e^{\tau - 1}$,
$K(\theta_1,\theta_2) = \int e^{\theta_1 + \theta_2 X(r) - 1}\, \mu(dr)$, and

$$G(\theta_2) = \min_{\theta_1} [K(\theta_1,\theta_2) - \theta_1] = \log \int e^{\theta_2 X(r)}\, \mu(dr) =: \Lambda(\theta_2),$$

for $\theta_2 \in \Theta_2 = \mathrm{dom}\, \Lambda$. The minimum in the
definition of $G(\theta_2)$ is attained for
$\theta_1 = 1 - \Lambda(\theta_2)$. If $m > -\infty$ and $X(r) = m$ on a set of
$\mu$-measure $\mu_0 > 0$ then^9 $k_{\max} = -\log \mu_0$. Otherwise
$k_{\max} = +\infty$ (assuming (24)).

For $k \in (0, k_{\max})$, Theorem 1 gives
$V(k) = \max_{\theta_2 < 0} [k + \Lambda(\theta_2)]/\theta_2$ and Theorem 2
gives the worst case localiser
$q_k = \exp(-\Lambda(\theta_2) + \theta_2 X)$, with $\theta_2$ attaining the
above maximum. This worst case localiser is always a density. It also satisfies
$H(q_k) = k$ and hence is actually a WCD, except for the case when
$\mathrm{dom}\, \Lambda$ contains its left endpoint $\theta_{\min}$,
$\Lambda'(\theta_{\min})$ is finite, and $k > H(q_{\theta_{\min}})$. In that
case the maximiser in Theorem 1, equal to the parameter of the worst case
localiser, is $\theta_2 = \theta_{\min}$. Note that for this example, the
formula for $V(k)$ appears in [1], and [5] shows that in the above exceptional
case the minimum in (10) is not attained. The result of Theorem 2 appears
new even for this special case.

In this example, the Bregman distance (28) of densities coincides with
I-divergence, hence Theorem 2 gives $D(p\,\|\,q_k) \le \gamma - \theta_2\varepsilon$ for each
$(\varepsilon,\gamma)$-AWCD $p$. In the Corollary of Theorem 2 the almost worst case
densities now converge to the worst case localiser in a much stronger sense
than in measure. Indeed, the result that their I-divergence from the worst
case localiser approaches 0 is stronger than $L_1(\mu)$ convergence to the worst
case localiser.
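The recipe of Example 3 can be tried numerically. The following is a minimal sketch, assuming a finitely supported $\mu$ (an illustration only, not a setting singled out in the text); the function names `log_mgf`, `tilted_density`, `relative_entropy` and `worst_case` are hypothetical. It finds the $\theta_2 < 0$ with $H(q_{\theta_2}) = k$ by bisection, which is the stationarity condition of $[k + \Lambda(\theta_2)]/\theta_2$, and returns the exponentially tilted density together with its expected payoff.

```python
import math

def log_mgf(theta, xs, mu):
    """Lambda(theta) = log sum_i mu_i * exp(theta * x_i), computed stably."""
    shift = max(theta * x for x in xs)
    return shift + math.log(sum(m * math.exp(theta * x - shift) for x, m in zip(xs, mu)))

def tilted_density(theta, xs, mu):
    """mu-density exp(-Lambda(theta) + theta * x) on the support points xs."""
    lam = log_mgf(theta, xs, mu)
    return [math.exp(theta * x - lam) for x in xs]

def relative_entropy(q, mu):
    """H(q) = sum_i mu_i * q_i * log q_i for a mu-density q."""
    return sum(m * qi * math.log(qi) for qi, m in zip(q, mu) if qi > 0.0)

def worst_case(k, xs, mu, lo=-50.0, hi=-1e-9, iters=200):
    """Bisect on theta < 0 until H(q_theta) = k.

    Assumption of this sketch: 0 < k < H(q_lo), so the root lies in (lo, hi).
    H(q_theta) decreases as theta increases towards 0.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if relative_entropy(tilted_density(mid, xs, mu), mu) > k:
            lo = mid  # still too far from mu: move theta towards 0
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    q = tilted_density(theta, xs, mu)
    value = sum(m * qi * x for x, qi, m in zip(xs, q, mu))
    return theta, q, value
```

For the returned `theta`, the expected payoff `value` agrees with `(k + log_mgf(theta, xs, mu)) / theta` up to rounding, because $H(q_{\theta_2}) = \theta_2 \int X q_{\theta_2}\,d\mu - \Lambda(\theta_2)$ for a tilted density.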

Footnote (Example 3, case $k = k_{\max}$): Note that Theorems 1 and 2 do not apply
to $k = k_{\max}$. Here, in that case the WCD equals $1/\mu_0$ on the set
$\{r : X(r) = m\}$ and 0 elsewhere. It does not belong to the family (22),
and the almost worst case densities do not cluster in its Bregman neighbourhood.

Example 4. Again in the setting of Example 1, take now $f(s) = -\log s$.
Then $H(p)$ equals reverse I-divergence, i.e., the I-divergence of the default
distribution $P_0 = \mu$ from the distribution $P$ with density $p$. As $f$ is not
cofinite, the standing assumption $k_{\max} > 0$ holds if and only if $m > -\infty$.

Take specifically $\Omega = (0,1)$, $X(r) = r$, and take for $\mu = P_0$ the
distribution with Lebesgue density $2r$. As $f^*(\tau) = -1 - \log(-\tau)$ ($\tau < 0$), then
$K(\theta_1,\theta_2) = \int_0^1 [-1 - \log(-\theta_1 - \theta_2 r)]\,2r\,dr$ and $p_{\theta_1,\theta_2}(r) = 1/(-\theta_1 - \theta_2 r)$ for
$(\theta_1,\theta_2) \in \Theta = \operatorname{dom} K = \{(\theta_1,\theta_2) : \theta_1 \le 0,\ \theta_1 + \theta_2 < 0\}$. Simple calculus
shows that for $-2 < \theta_2 < 0$ the minimum in the definition of $G(\theta_2)$ is
attained for $\theta_1$ such that $p_{\theta_1,\theta_2} = q_{\theta_2}$ is a density, but the functions $G$ and
$q_{\theta_2}$ can not be given explicitly. If $\theta_2 \le -2$ then this minimum is attained for
$\theta_1 = 0$, and $G(\theta_2) = -\tfrac12 - \log(-\theta_2)$. One sees that $H(q_{\theta_2})$ ranges from 0 to
$\log 2 - 1/2$ as $\theta_2$ ranges from 0 to $-2$. Hence in case $k \le \log 2 - 1/2$ the WCD
exists; it equals that $p_{\theta_1,\theta_2}$ which is a density and satisfies
$H(p_{\theta_1,\theta_2}) = k$. In case $k > \log 2 - 1/2$ the worst case localiser is
$q_k(r) = -1/(\theta_2 r)$, with $\theta_2 < -2$ attaining $V(k) = \max_{\theta_2<0}\,[k + G(\theta_2)]/\theta_2$. By
simple calculus, this maximiser is $\theta_2 = -e^{k+1/2}$, the maximum is
$V(k) = e^{-(k+1/2)}$, and the worst case localiser is $q_k(r) = e^{-(k+1/2)}/r$,
which is not a density unless $k = \log 2 - 1/2$.
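The closed forms of Example 4 for $k > \log 2 - 1/2$ can be cross-checked numerically. The sketch below is illustrative only and rests on stated assumptions: it uses $G(\theta_2) = K(0,\theta_2) = -\tfrac12 - \log(-\theta_2)$ on $\theta_2 \le -2$ (obtained by evaluating the integral for $K$ at $\theta_1 = 0$), restricts the search to that range, and replaces "simple calculus" by a crude grid search. All function names are hypothetical.

```python
import math

def G_tail(theta2):
    """G(theta2) = K(0, theta2) = -1/2 - log(-theta2), used for theta2 <= -2."""
    return -0.5 - math.log(-theta2)

def K0_midpoint(theta2, n=20000):
    """Midpoint rule for K(0, theta2) = int_0^1 [-1 - log(-theta2 * r)] * 2r dr."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += (-1.0 - math.log(-theta2 * r)) * 2.0 * r * h
    return total

def objective(theta2, k):
    """The quantity [k + G(theta2)] / theta2 maximised in Theorem 1."""
    return (k + G_tail(theta2)) / theta2

def closed_form(k):
    """Maximiser, maximum, and total mass of q_k(r) = exp(-(k + 1/2)) / r."""
    theta2 = -math.exp(k + 0.5)
    value = math.exp(-(k + 0.5))
    mass = 2.0 * value  # integral of q_k against mu(dr) = 2r dr on (0, 1)
    return theta2, value, mass

def grid_maximiser(k, lo=-200.0, hi=-2.0, n=200001):
    """Grid search for the maximiser of the objective on [lo, hi]."""
    grid = (lo + i * (hi - lo) / (n - 1) for i in range(n))
    best = max(grid, key=lambda t: objective(t, k))
    return best, objective(best, k)
```

With, say, `k = 1.0`, the grid maximiser should sit close to `-exp(1.5)` and the total mass `2 * exp(-1.5)` is below 1, in line with the statement that the localiser is not a density for $k > \log 2 - 1/2$.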

In this case the Bregman distance (28) is

\[ B(p,q) = \int \Big( \log\frac{q}{p} + \frac{p}{q} - 1 \Big)\, d\mu. \]

The Corollary of Theorem 2 now does not admit a substantial strengthening,
for the result that this Bregman distance approaches 0 does not imply
convergence in a familiar sense stronger than in measure.


Example 5. Let $\Omega$, $X$, $\mu$ be as in Example 4, but this time let the default
distribution $P_0$ be the uniform distribution, whose $\mu$-density is $p_0(r) = 1/(2r)$.
Take for $H(p)$ the Bregman distance $B(p,p_0)$ in Example 4, i.e., the integral
functional (??) with $\beta(r,s) = \Delta_f(s,p_0(r)) = -\log s - \log(2r) + 2r\,(s - \tfrac{1}{2r})$.
Then $\beta^*(r,\tau) = \log 2r - \log(-\tau + 2r)$, $(\beta^*)'(r,\tau) = 1/(-\tau + 2r)$, $\tau < 2r$.
The set $\Theta$ (equal to $\operatorname{dom} K$) of this example consists of those $(\theta_1,\theta_2)$ for
which $(\theta_1,\theta_2 - 2)$ belongs to the set $\Theta = \operatorname{dom} K$ of Example 4. Moreover,
for such $(\theta_1,\theta_2)$ the function $p_{\theta_1,\theta_2}(r) = 1/[-\theta_1 - (\theta_2 - 2)r]$ coincides with
the function $p_{\theta_1,\theta_2-2}$ of Example 4, which can not be a density if $\theta_2 < 0$.

This proves that in the present Example no WCD exists for any $k > 0$.
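The step "can not be a density if $\theta_2 < 0$" can be verified by a one-line estimate. The worked inequality below is a sketch under the reading of Example 4 used here, namely $\mu(dr) = 2r\,dr$ on $(0,1)$ and $\theta_1 \le 0$ on the relevant parameter set:

```latex
\int_0^1 p_{\theta_1,\theta_2}(r)\,\mu(dr)
  = \int_0^1 \frac{2r}{-\theta_1 + (2-\theta_2)\,r}\,dr
  \;\le\; \int_0^1 \frac{2r}{(2-\theta_2)\,r}\,dr
  = \frac{2}{2-\theta_2}
  \;<\; 1
  \qquad (\theta_1 \le 0,\ \theta_2 < 0).
```

Since the total mass stays strictly below 1 for every admissible parameter pair with $\theta_2 < 0$, no member of the family integrates to 1 in that range.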


3.3 Effect of the threshold k on the existence of a WCD 

This subsection addresses the effect of the choice of the threshold k on the 
worst case localiser, in particular on whether that localiser is also a WCD. 

Examples 3, 4 and 5 demonstrate that a WCD may exist for all or for no
$k$, or there may exist a critical value $k_{cr}$ such that a WCD exists if $k < k_{cr}$
but does not exist if $k > k_{cr}$. It appears a plausible conjecture that these
three alternatives are exhaustive, i.e., that if a WCD exists for some $k$, it
also exists for each $k' < k$. While this conjecture remains open in general,
it will be proved under conditions that cover many typical cases.

Recall that $\theta_{\min}$ with $-\infty \le \theta_{\min} < 0$ denotes the left endpoint of the
interval $\Theta_2$, the projection of $\operatorname{dom} K$ to the $\theta_2$ axis. The condition $m > -\infty$
is necessary for $k_{\max} < +\infty$ and sufficient for $\theta_{\min} = -\infty$.

Theorem 3. (i) If for some $k \in (0, k_{\max})$ the worst case localiser is a
density, it is a WCD for $k$ unless [footnote 10]

\[ \theta_{\min} \in \Theta_2, \qquad G'(\theta_{\min}) > -\infty \tag{47} \]

and

\[ k > k_{cr} := -G(\theta_{\min}) + \theta_{\min} G'(\theta_{\min}). \tag{48} \]

If (47) and (48) hold then $q_k = q_{\theta_{\min}}$ and no WCD exists for $k$.

(ii) If $\operatorname{dom} K$ contains the $\theta_1$-axis, i.e., $\int \beta^*(r,\theta_1)\,\mu(dr)$ is finite for each
$\theta_1 \in \mathbb{R}$, then the worst case localiser $q_k$ is a density and hence it is a WCD
unless (47) and (48) hold.

Proof. (i) Suppose $q_k = p_{\theta_1,\theta_2}$ is a density. By Lemma 4 it is a WCD
for $k$ if and only if

\[ \int X p_{\theta_1,\theta_2}\,d\mu = V(k). \tag{49} \]

Since $p_{\theta_1,\theta_2}$ is a worst case localiser and $\int p_{\theta_1,\theta_2}\,d\mu = 1$, Lemma 6 gives that

\[ (\tilde\theta_2 - \theta_2)\Big( V(k) - \int X p_{\theta_1,\theta_2}\,d\mu \Big) \le 0 \quad \text{for } \tilde\theta_2 \in \Theta_2. \tag{50} \]

This immediately implies (49) if $\theta_2 \ne \theta_{\min}$, or equivalently (see Remark 4)
if the supporting line through $(0,-k)$ to the curve $\mathcal{G}$ does not contain
$(\theta_{\min}, G(\theta_{\min}))$. This is always the case if (47) does not hold, and also
when (47) holds but $G'(\theta_{\min})$, the largest slope of supporting lines to $\mathcal{G}$
at $(\theta_{\min}, G(\theta_{\min}))$, is less than $[k + G(\theta_{\min})]/\theta_{\min}$. As the last condition is
equivalent to $k < k_{cr}$, only the case $k = k_{cr}$ remains to be covered to complete
the proof that $q_k$ is a WCD unless (47) and (48) hold.

In that remaining case, $q_k = p_{\theta_1,\theta_2}$ with $\theta_2 = \theta_{\min}$, and instead of (49)
only the inequality $\int X p_{\theta_1,\theta_2}\,d\mu \ge V(k_{cr})$ follows from (50). Suppose
indirectly that it is strict; then Lemma ?? implies that $\int X p_{\theta_1,\theta_2}\,d\mu = V(\tilde k)$ for
some $\tilde k \in (0, k_{cr})$ (as the integral is less than $b_0$ by Lemma ??). This means by
Lemma 4 that $p_{\theta_1,\theta_2}$ (with $\theta_2 = \theta_{\min}$) is a WCD for $\tilde k$. Hence, by Remark 4,
the supporting line through $(0,-\tilde k)$ to the curve $\mathcal{G}$ contains $(\theta_{\min}, G(\theta_{\min}))$,
contradicting the fact that among the supporting lines to $\mathcal{G}$ at $(\theta_{\min}, G(\theta_{\min}))$
the one through $(0,-k_{cr})$ has the largest slope. This contradiction proves
that (49) holds and hence $q_k$ is a WCD also when $k = k_{cr}$.

The last assertions of part (i) are obvious. Indeed, if (47) holds and
$k > k_{cr}$ then the supporting line through $(0,-k)$ to $\mathcal{G}$ meets the curve at
$(\theta_{\min}, G(\theta_{\min}))$, just as the supporting line through $(0,-k_{cr})$ does, hence
$q_k = q_{\theta_{\min}} = q_{k_{cr}}$. As the latter is a WCD for $k_{cr}$, it can not be a WCD for
$k > k_{cr}$ with $V(k) < V(k_{cr})$.

(ii) The hypothesis implies that a vector in $\mathbb{R}^2$ can belong to the normal
cone of $\operatorname{dom} K$ at some $(\theta_1,\theta_2)$ only if the first component of this vector is 0.
On account of Lemma 6, this proves that the worst case localiser $q_k = p_{\theta_1,\theta_2}$,
for any $k \in (0,k_{\max})$, has to satisfy $\int p_{\theta_1,\theta_2}\,d\mu = 1$. □

Footnote 10: Here $G'(\theta_{\min})$ means the right derivative.

Corollary 3. The function $G$ is differentiable at each $\theta_2 \in (\theta_{\min}, 0)$ for
which $q_{\theta_2}$ in (41) is a density.

Proof. Suppose indirectly that the curve $\mathcal{G}$ has several supporting lines at
$(\theta_2, G(\theta_2))$, say one containing $(0,-k_1)$ and another $(0,-k_2)$, where $k_1 \ne k_2$.
Then $q_{k_1} = q_{k_2} = q_{\theta_2}$ by Remark 4, hence $q_{\theta_2}$ is the WCD both for
$k_1$ and $k_2$, by Theorem 3. This means that $V(k_1) = \int X q_{\theta_2}\,d\mu = V(k_2)$,
contradicting $k_1 \ne k_2$. □

Finally, we discuss for $f$-divergence balls, see Example 1, the dependence
on the threshold $k$ (the "radius" of the ball) of the worst case localiser and
whether it is a WCD. Formally, let $\beta(r,s) = f(s)$ be an autonomous integrand,
$f$ strictly convex and differentiable on $(0,+\infty)$, $f(0) = \lim_{s\downarrow 0} f(s)$,
$f(1) = 0$, let $\mu$ be a probability measure, and $P_0 = \mu$.

The case of cofinite $f$ is covered by Theorem 3(ii), the integral in its
hypothesis being equal to $f^*(\theta_1)$, finite for each $\theta_1 \in \mathbb{R}$. Therefore we focus
on the non-cofinite case, supposing

\[ \lim_{s\uparrow+\infty} \frac{f(s)}{s} = c, \qquad c \text{ finite}. \tag{51} \]

Then the standing assumption $k_{\max} > 0$ (equivalent to $\theta_{\min} < 0$) holds if
and only if $m > -\infty$. With no loss of generality, assume that $m = 0$ (clearly,
the minimisation problem (10) is not affected by adding a constant to $X$).

Under the above assumptions, $K(\theta_1,\theta_2) = \int f^*(\theta_1 + \theta_2 X)\,d\mu$ with $\theta_2 < 0$
is finite if $\theta_1 < c$ and infinite if $\theta_1 > c$, because (51) implies that $f^*(\tau)$ is finite
for $\tau < c$ but not for $\tau > c$. It follows for any $\theta_2 < 0$ that the associated $\bar\theta_1$
in Lemma 2 is equal to $c$, hence Lemma 2 gives that the function $q_{\theta_2} = p_{\theta_1,\theta_2}$
in (41) is a density if and only if

\[ g(\theta_2) := \int (f^*)'(c + \theta_2 X)\,d\mu \ge 1. \tag{52} \]

Moreover, if $g(\theta_2) < 1$ then

\[ q_{\theta_2} = (f^*)'(c + \theta_2 X) = p_{c,\theta_2}. \tag{53} \]

Theorem 4. Under the assumptions (51) and $m = 0$, if $g(\theta_2) = +\infty$ for
each $\theta_2 < 0$ then the WCD exists for all $k \in (0,k_{\max})$. Otherwise, denote

\[ \tilde\theta_{\min} := \inf\{\theta_2 : g(\theta_2) \ge 1\}, \tag{54} \]

\[ k_{cr} := \tilde\theta_{\min}\, G'_+(\tilde\theta_{\min}) - G(\tilde\theta_{\min}). \tag{55} \]

Then $\tilde\theta_{\min} \in (-\infty,0)$, $k_{cr} \in (0, k_{\max})$, and for $k < k_{cr}$ the WCD exists.
For $k \ge k_{cr}$ the worst case localiser is of form (53): it is not a density
if $k > k_{cr}$, while for $k = k_{cr}$ it is a density (and hence a WCD) unless [footnote 11]
$g(\theta_2) < 1$ for each $\theta_2 < 0$ with $g(\theta_2) < +\infty$.


Proof. Since in the current case $\theta_{\min} = -\infty$, Theorem 3 implies that the
worst case localiser is a WCD if and only if it is a density. By the passage
preceding the Theorem, the latter holds if and only if $q_k = q_{\theta_2}$ with $\theta_2$
satisfying (52). This immediately proves the first assertion.

Suppose next that $g(\theta_2)$ is finite for some $\theta_2 < 0$, and denote the
supremum of such parameters $\theta_2$ by $\sigma$. One verifies via monotone convergence
and dominated convergence that $g(\theta_2)$ is a continuous, strictly increasing
function of $\theta_2 \in (-\infty,\sigma)$ that approaches 0 or $g(\sigma)$ as $\theta_2$ goes to $-\infty$ or $\sigma$.
Hence if $g(\sigma) \ge 1$ then $\tilde\theta_{\min}$ is equal to the unique $\theta_2 \le \sigma$ with $g(\theta_2) = 1$,
whereas if $g(\sigma) < 1$ then $\tilde\theta_{\min} = \sigma$. In both cases $-\infty < \tilde\theta_{\min} < 0$, using
that $g(0) = +\infty$.

By the definition (55) of $k_{cr}$, the supporting line to $\mathcal{G}$ at $(\tilde\theta_{\min}, G(\tilde\theta_{\min}))$
of slope $G'_+(\tilde\theta_{\min})$ intersects the vertical axis at $(0,-k_{cr})$. Hence $k_{cr} > 0$,
unless the function $G$ is linear in the interval $[\tilde\theta_{\min},0]$; the latter possibility
will be ruled out in the Appendix. It follows, too, that the supporting line
to $\mathcal{G}$ through $(0,-k)$ with $k < k_{cr}$ or $k > k_{cr}$ meets the curve $\mathcal{G}$ at a point
(or points) with argument $\theta_2 > \tilde\theta_{\min}$ respectively $\theta_2 \le \tilde\theta_{\min}$. Moreover, the
latter inequality is strict if $g(\tilde\theta_{\min}) = 1$ (equivalent to $g(\sigma) \ge 1$), for in that
case $G$ is differentiable at $\tilde\theta_{\min}$ due to Corollary 3.

Referring to Remark 4, the above considerations prove that the parameter
$\theta_2$ in the representation $q_k = q_{\theta_2}$ satisfies or does not satisfy the
condition (52) if $k < k_{cr}$ respectively $k > k_{cr}$, no matter whether $g(\sigma) \ge 1$
or not. These facts, and that $q_k = q_{\tilde\theta_{\min}}$ if $k = k_{cr}$, imply all remaining
assertions of the Theorem, see the first passage of the proof. □
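As a consistency check, the quantities of Theorem 4 can be evaluated for Example 4. The worked computation below is a sketch under the assumptions of that example ($f(s) = -\log s$, $X(r) = r$, $\mu(dr) = 2r\,dr$ on $(0,1)$, hence $c = 0$ and $m = 0$); it writes $\tilde\theta_{\min}$ for the quantity in (54) and uses $G(\theta_2) = -\tfrac12 - \log(-\theta_2)$ for $\theta_2 \le -2$, together with the differentiability of $G$ at $\tilde\theta_{\min}$ from Corollary 3.

```latex
\begin{align*}
(f^*)'(\tau) &= -\frac{1}{\tau} \quad (\tau < 0), &
g(\theta_2) &= \int_0^1 \frac{1}{-\theta_2 r}\, 2r\,dr = -\frac{2}{\theta_2},\\
g(\theta_2) \ge 1 &\iff \theta_2 \ge -2, &
\tilde\theta_{\min} &= -2, \qquad g(\tilde\theta_{\min}) = 1,\\
G'(\theta_2) &= -\frac{1}{\theta_2} \quad (\theta_2 \le -2), &
G'_+(-2) &= \tfrac12,\\
k_{cr} &= (-2)\cdot\tfrac12 - \bigl(-\tfrac12 - \log 2\bigr) &
&= \log 2 - \tfrac12 .
\end{align*}
```

This agrees with the critical threshold $\log 2 - 1/2$ found directly in Example 4.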

Footnote 11: It is left open whether that exceptional case is possible.

Appendix

Proof of Lemma 2

Proof. Fix $\theta_2 \in \Theta_2$, define $\bar\theta_1$ as in the lemma. Then $(\theta_1,\theta_2) \in \Theta$ for
all $\theta_1 < \bar\theta_1$, and the function $f(\theta_1) := K(\theta_1,\theta_2)$ is convex, closed, and
differentiable in the interior $(-\infty,\bar\theta_1)$ of its effective domain, with

\[ f'(\theta_1) = \int p_{\theta_1,\theta_2}\,d\mu, \qquad \theta_1 < \bar\theta_1, \tag{56} \]

see (26). If $(\bar\theta_1,\theta_2) \in \Theta$ then (56) holds also for the left derivative at $\theta_1 = \bar\theta_1$.
Hence the last assertion of the Lemma immediately follows.

To prove that one of the alternatives (i) and (ii) indeed takes place, note
that the properties of $\beta^*$ stated in the passage after (??) imply, by monotone
convergence, that $f'(\theta_1)$ in (56) goes to 0 if $\theta_1 \downarrow -\infty$ and to $+\infty$ if $\bar\theta_1 = +\infty$
and $\theta_1 \uparrow +\infty$. Hence, due to continuity of $f'(\theta_1)$, alternative (i) fails only if

\[ \int p_{\theta_1,\theta_2}\,d\mu < 1 \quad \text{for all } \theta_1 \text{ with } (\theta_1,\theta_2) \in \Theta, \tag{57} \]

and (57) can hold only if $\bar\theta_1 < +\infty$. Further, (57) implies that $(\bar\theta_1,\theta_2) \in
\operatorname{dom} K$, for in the opposite case $f(\bar\theta_1) = +\infty$ the derivative (56) of the closed
convex function $f(\theta_1)$ would go to $+\infty$ as $\theta_1 \uparrow \bar\theta_1$.

The proof will be complete if we show that (57) implies $(\bar\theta_1,\theta_2) \in \Theta$.
It has already been shown to imply $(\bar\theta_1,\theta_2) \in \operatorname{dom} K$, in particular, that
$\bar\theta_1 + \theta_2 X(r) \le \beta'(r,+\infty)$ $\mu$-a.e.; thus it remains to verify, see (23), that
the set $\{r : \bar\theta_1 + \theta_2 X(r) = \beta'(r,+\infty)\}$ has $\mu$-measure 0. On that set,
$p_{\theta_1,\theta_2}(r) = (\beta^*)'(r,\theta_1 + \theta_2 X(r))$ grows to $+\infty$ as $\theta_1 \uparrow \bar\theta_1$. Hence, were it not
a 0-measure set, $\int p_{\theta_1,\theta_2}\,d\mu$ would grow to $+\infty$, contradicting (57). □

Proof of Lemma 5

Proof. Fix $\theta_2 \in \mathbb{R}$ and consider the (not necessarily proper) convex function

\[ L(a) := \inf_b\,\bigl(J(a,b) - \theta_2 b\bigr), \qquad a \in \mathbb{R}. \]

Then

\begin{align*}
F^*(\theta_2) &= \sup_b\,(\theta_2 b - F(b)) = -\inf_b\,(F(b) - \theta_2 b) = -L(1) \\
 &= -L^{**}(1) = -\sup_{\theta_1}\,(\theta_1 - L^*(\theta_1)) = \inf_{\theta_1}\,(L^*(\theta_1) - \theta_1),
\end{align*}

where the third equality holds since $F(b) = J(1,b)$, and the fourth one holds
since $a = 1$ is in the interior of $\operatorname{dom} L$. Here

\begin{align*}
L^*(\theta_1) &= \sup_a\,(\theta_1 a - L(a)) = \sup_a\,\bigl[\theta_1 a + \sup_b\,(-J(a,b) + \theta_2 b)\bigr] \\
 &= J^*(\theta_1,\theta_2) = K(\theta_1,\theta_2).
\end{align*}

Recalling the definition (20) of $G$, this completes the proof. □

Completion of the proof of Theorem 4

It remains to rule out the possibility that the function $G$ is linear in the
interval $[\tilde\theta_{\min},0]$. Suppose indirectly that for some $b \in \mathbb{R}$

\[ G(\theta_2) = \theta_2 b \quad \text{if } \theta_2 \in [\tilde\theta_{\min},0]. \tag{58} \]

Here necessarily $b \le b_0$, by (21). As (58) implies $G^*(b) = 0$, which means
by (19) that $F(b) = 0$, it follows by Remark 2 that actually $b = b_0$.

As $g(\tilde\theta_{\min}) \le 1$ by the proof of Theorem 4, the value $\theta_1$ in (41) attaining
$K(\theta_1,\theta_2) - \theta_1 = G(\theta_2)$ for $\theta_2 = \tilde\theta_{\min}$ is equal to $c$. Thus Lemma 3 applied
to $p = p_0$ and $(\theta_1,\theta_2) = (c,\tilde\theta_{\min})$ gives

\[ 0 = H(p_0) \ge \tilde\theta_{\min} \int X p_0\,d\mu - G(\tilde\theta_{\min}) + B(p_0, q_{\tilde\theta_{\min}}). \]

Here the integral equals $b_0$ by definition, and $b$ in (58) has been shown to
equal $b_0$, so the first two terms on the right cancel. Hence it follows that
$B(p_0, q_{\tilde\theta_{\min}}) = 0$, which means that $q_{\tilde\theta_{\min}}$ equals $p_0 = 1$ ($\mu$-a.e.).
By Remark 3 this contradicts $\tilde\theta_{\min} \ne 0$, proving that the indirect
assumption (58) is false.